class: center, middle, title-slide

.title[
# Effective Visual Communication, Part 2
]
.author[
### Claus O. Wilke
]
.date[
### last updated: 2024-06-11
]

---

## Intro slide...

---
class: center middle

---
class: center middle

## 5. Making interactive plots

---

## Interactivity can help a lot with data exploration

.center[
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## We can highlight across two plots for added context

.center[
]

---

## We can do this with the ggiraph package

<img src = "interactive_plots/ggiraphlogo.svg", width = 20%, style = "position:absolute; top: 15%; right: 10%;"></img>

- Straightforward integration into ggplot2

- Extremely lightweight, no running server required

- But only works with HTML output

---

## Example 1: Simple scatter plot

.tiny-font.pull-left.width-50[

```r
#
iris_scatter <- ggplot(iris) +
  aes(
    Sepal.Length, Sepal.Width,
    color = Species
  ) +
  geom_point()

iris_scatter
```
]

.pull-right.move-up-1em[
<img src="2024-06-20-part2_files/figure-html/iris-no-girafe-demo-out-1.svg" width="100%" />

.small-font[
regular **ggplot2** plot: hovering does nothing
]
]

---

## Example 1: Simple scatter plot

.tiny-font.pull-left.width-50[

```r
*library(ggiraph)

iris_scatter <- ggplot(iris) +
  aes(
    Sepal.Length, Sepal.Width,
    color = Species
  ) +
* geom_point_interactive(
*   aes(tooltip = Species)
  )

*girafe(
* ggobj = iris_scatter,
* width_svg = 6,
* height_svg = 6*0.618
*)
```
]

.pull-right[
.small-font[
**ggiraph** version: hovering displays species names
]
]

---

## Styling happens via Cascading Style Sheets (CSS)

.tiny-font.pull-left.width-50[

```r
library(ggiraph)

iris_scatter <- ggplot(iris) +
  aes(
    Sepal.Length, Sepal.Width,
    color = Species
  ) +
  geom_point_interactive(
    aes(tooltip = Species)
  )

girafe(
  ggobj = iris_scatter,
  width_svg = 6,
  height_svg = 6*0.618,
* options = list(
*   opts_tooltip(
*css = "background: #F5F5F5; color: #191970;"
*   )
* )
)
```
]

.pull-right[
.small-font[
**ggiraph** version: hovering displays species names
]
]

---

## Select multiple points at once with `data_id` aesthetic

.tiny-font.pull-left.width-50[

```r
library(ggiraph)

iris_scatter <- ggplot(iris) +
  aes(
    Sepal.Length, Sepal.Width,
    color = Species
  ) +
  geom_point_interactive(
*   aes(data_id = Species),
    size = 2
  )

girafe(
  ggobj = iris_scatter,
  width_svg = 6,
  height_svg = 6*0.618
)
```
]

.pull-right[
]

---

## Select multiple points at once with `data_id` aesthetic

.tiny-font.pull-left.width-50[

```r
library(ggiraph)

iris_scatter <- ggplot(iris) +
  aes(
    Sepal.Length, Sepal.Width,
    color = Species
  ) +
  geom_point_interactive(
    aes(data_id = Species),
    size = 2
  )

girafe(
  ggobj = iris_scatter,
  width_svg = 6,
  height_svg = 6*0.618,
  options = list(
*   opts_hover(css = "fill: #202020;"),
*   opts_hover_inv(css = "opacity: 0.2;")
  )
)
```
]

.pull-right[
Again, styling via CSS
]

---

## Example 2: Interactive map of Texas

.tiny-font[

```r
# get the data
tx_census <- read_csv(
  "https://wilkelab.org/SDS375/datasets/US_census.csv",
  col_types = cols(FIPS = 'c')
) %>%
  filter(state == "Texas") %>%
  select(FIPS, pop2010)

texas_income <- readRDS(url("https://wilkelab.org/SDS375/datasets/Texas_income.rds"))

tx_counties <- left_join(texas_income, tx_census, by = "FIPS")

tx_counties
```

```
Simple feature collection with 254 features and 5 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -106.6456 ymin: 25.83738 xmax: -93.50829 ymax: 36.5007
Geodetic CRS:  NAD83
First 10 features:
    FIPS    county median_income  moe pop2010                       geometry
1  48001  Anderson         41327 1842   58458 MULTIPOLYGON (((-96.0648 31...
2  48003   Andrews         70423 6038   14786 MULTIPOLYGON (((-103.0647 3...
3  48005  Angelina         44223 1611   86771 MULTIPOLYGON (((-95.00488 3...
4  48007   Aransas         41690 3678   23158 MULTIPOLYGON (((-96.8229 28...
5  48009    Archer         60275 5182    9054 MULTIPOLYGON (((-98.95382 3...
6  48011 Armstrong         59737 4968    1901 MULTIPOLYGON (((-101.6294 3...
7  48013  Atascosa         52192 3005   44911 MULTIPOLYGON (((-98.80479 2...
8  48015    Austin         53687 3810   28417 MULTIPOLYGON (((-96.62085 3...
9  48017    Bailey         37397 8652    7165 MULTIPOLYGON (((-103.0469 3...
10 48019   Bandera         49863 7193   20485 MULTIPOLYGON (((-99.60332 2...
```
]

---

## Part 1: Scatter plot

.tiny-font.pull-left.width-50[

```r
texas_scatter <- tx_counties %>%
  ggplot(aes(pop2010, median_income)) +
  geom_point_interactive(
    aes(tooltip = county, data_id = county),
    na.rm = TRUE, size = 3.5
  ) +
  scale_x_log10() +
  theme_bw()

girafe(
  ggobj = texas_scatter,
  width_svg = 5,
  height_svg = 5
)
```
]

.pull-right.width-45[
]

---

## Part 2: Map

.tiny-font.pull-left.width-50[

```r
texas_county_map <- tx_counties %>%
  ggplot() +
  geom_sf_interactive(
    aes(
      tooltip = county,
      data_id = county
    )
  ) +
  coord_sf(crs = 3083) +
  theme_void()

girafe(
  ggobj = texas_county_map,
  width_svg = 6,
  height_svg = 6
)
```
]

.pull-right.width-50[
]

---

## Combining both

.tiny-font[

```r
girafe(
  ggobj = plot_grid(texas_scatter, texas_county_map),
  width_svg = 10,
  height_svg = 4.5
)
```
]

.width-80[
]

---

## Now try it yourself

Make an interactive map of the US states.

--

.tiny-font[

```r
# get the data
US_states <- readRDS(url("https://wilkelab.org/SDS375/datasets/US_states.rds"))

US_states
```

```
Simple feature collection with 51 features and 3 fields
Geometry type: MULTIPOLYGON
Dimension:     XY
Bounding box:  xmin: -3683715 ymin: -2839538 xmax: 2258154 ymax: 1558935
CRS:           NA
First 10 features:
   GEOID                 name state_code                       geometry
1     01              Alabama         AL MULTIPOLYGON (((1032679 -63...
2     04              Arizona         AZ MULTIPOLYGON (((-1216674 -4...
3     05             Arkansas         AR MULTIPOLYGON (((462619.4 -3...
4     06           California         CA MULTIPOLYGON (((-2077630 -2...
5     08             Colorado         CO MULTIPOLYGON (((-527710.6 3...
6     09          Connecticut         CT MULTIPOLYGON (((1841099 622...
7     10             Delaware         DE MULTIPOLYGON (((1762798 354...
8     11 District of Columbia         DC MULTIPOLYGON (((1610777 321...
9     12              Florida         FL MULTIPOLYGON (((1431218 -13...
10    13              Georgia         GA MULTIPOLYGON (((1339965 -64...
```
]

---

## Now try it yourself

Non-interactive version of the plot. <br>
(You can find the code for the interactive version [here.](https://wilkelab.org/SDS375/slides/interactive-plots.html))

.tiny-font.pull-left.width-50[

```r
US_states %>%
  ggplot() +
  geom_sf() +
  theme_void()
```
]

.pull-right.width-50.move-up-4em[
<!-- -->
]

---
class: center middle

## 6. Avoid overplotting

---

## Be aware of points plotted exactly on top of one another

.center[
<img src="2024-06-20-part2_files/figure-html/mpg-cty-displ-solid-1.svg" width="55%" />
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

--

Technical term for this problem: overplotting

---

## Partial transparency helps highlight overlapping points

.center[
<img src="2024-06-20-part2_files/figure-html/mpg-cty-displ-transp-1.svg" width="55%" />
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization.
O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## A little jitter shows overlaps even more clearly

.center[
<img src="2024-06-20-part2_files/figure-html/mpg-cty-displ-jitter-1.svg" width="55%" />
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## But don't jitter too much

.center[
<img src="2024-06-20-part2_files/figure-html/mpg-cty-displ-jitter-extreme-1.svg" width="55%" />
]

???

Figure redrawn from [Claus O. Wilke. Fundamentals of Data Visualization. O'Reilly, 2019.](https://clauswilke.com/dataviz)

---

## Further reading

- Fundamentals of Data Visualization: [Chapter 18: Handling overlapping points](https://clauswilke.com/dataviz/overlapping-points.html)
- Fundamentals of Data Visualization: [Chapter 20: Redundant coding](https://clauswilke.com/dataviz/redundant-coding.html)
- Fundamentals of Data Visualization: [Chapter 21: Multi-panel figures](https://clauswilke.com/dataviz/multi-panel-figures.html)
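
---

## Sketch: transparency and jitter in ggplot2

A minimal sketch of the two overplotting fixes from section 6. It assumes the figures are based on the `mpg` dataset that ships with **ggplot2** (suggested by the figure file names). The `alpha`, `width`, `height`, and `seed` values are illustrative guesses, not the settings behind the original figures.

.tiny-font[

```r
library(ggplot2)

# ASSUMPTION: city fuel economy (cty) vs. engine displacement (displ)
# from ggplot2's mpg data; both variables are coarsely rounded, so many
# cars land on exactly the same coordinates
p <- ggplot(mpg, aes(displ, cty))

# partial transparency: stacked points appear darker
p_transp <- p + geom_point(alpha = 0.3, size = 2)

# a little jitter: small random offsets pull stacked points apart;
# fixing the seed makes the plot reproducible
p_jitter <- p + geom_point(
  alpha = 0.3, size = 2,
  position = position_jitter(width = 0.05, height = 0.2, seed = 1234)
)

p_jitter
```
]

Keep `width` and `height` small relative to the spacing of the data values; large offsets distort the data, which is the "too much jitter" problem.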